Support providing multi config files #336

pearsonca · 2024-10-09T18:53:21Z

Describe your changes.

This PR introduces support for using multiple configuration files. Briefly, commands like

flepimop COMMAND -c config.yml

will now (and going forward, more easily) support

flepimop COMMAND config1.yml config2.yml config3.yml ...

n.b. the -c flag is no longer required for specifying config files.

This PR adds the patch command:

flepimop patch config1.yml config2.yml > config_combined.yml

which will produce the monolithic version of the config file(s). The specific options (e.g. write-csv, run_id, ...) are now also incorporated into the config; patch captures those options and includes them in the resulting config file.

This PR deprecates gempyor-simulate ..., in favor of flepimop simulate ...; gempyor-simulate ... still works, but issues a deprecation warning. The -c option is not yet deprecated by this PR, but we anticipate it will be in the future.

Documentation Updates

This PR includes new example partial YAML configurations. The gitbook documentation is updated to replace gempyor-simulate with flepimop simulate and to remove the -c from those examples. A new gitbook section explains the use of multiple configuration files.

flepimop/gempyor_pkg/src/gempyor/cli.py

pearsonca · 2024-10-10T14:00:54Z

@jcblemai I addressed the issues from our in-person discussion:

individual component CLI is pushed back out to the "implementing" class for compartments
I removed the deprecation warning for -c. It still issues a warning when the arguments-based approach is used, indicating that the new approach overrides the -c approach

pearsonca · 2024-10-11T20:49:17Z

Righto, the CI issues appear to be passing now.

There's a lot of faff going on with passing around config. Troubleshooting the problems here hit against that design decision several times.

TimothyWillard

Looks like a good start, left comments on the big things, but I certainly missed smaller things. The big points I want to emphasize are:

It doesn't look like there were any unit tests added for any of the new functionality which would be really helpful for maintaining the new features over time,
There is a general lack of documentation/type hints which would be helpful in understanding what the new functionality does and allowing others to integrate it into their work,
Lots of small stylistic changes, but importantly ones that help clean up the diff would be helpful for reviewing,
A formatter like black is recommended for new files, and
It's not clear to me that these changes are backwards compatible, if anything the changes to examples/test_cli.py suggest that they are not, which I think would create a large hurdle for operations. Is there a way to make these changes backwards compatible? Or if they are backwards compatible can the changes to examples/test_cli.py reflect that by providing the config both ways for all of the fixtures?

flepimop/gempyor_pkg/src/gempyor/cli.py

TimothyWillard · 2024-10-14T13:22:31Z

flepimop/gempyor_pkg/src/gempyor/cli.py

+# Guidance for extending the CLI:
+# - to add a new small command to the CLI, can just add a new function with the @cli.command() decorator here (e.g. patch below)
+# - to add something with lots of module logic in it, should define that in the module (e.g. compartments for a group command, or simulate for a single command)
+# - ... and then import that module here to add it to the CLI


This should probably go into the wiki instead of inline in the file.

maybe - I think it's useful to have at hand when editing, though I agree we should be updating a contribution / style guide as well.

I still think putting in the wiki would be preferable, it would provide more space to give commentary and even work through a small example.

flepimop/gempyor_pkg/src/gempyor/cli.py

flepimop/gempyor_pkg/src/gempyor/config_validator.py

TimothyWillard · 2024-10-14T13:27:26Z

flepimop/gempyor_pkg/src/gempyor/model_info.py

@@ -142,7 +142,7 @@ def __init__(

            # SEIR modifiers
            self.npi_config_seir = None
-            if config["seir_modifiers"].exists():
+            if config["seir_modifiers"].exists() and self.seir_modifiers_scenario is not None:


What was the need that caused this change (as well as the similar one for outcome modifiers)? Is there a corresponding unit test to go with this change?

So the problem here is passing both the config and the scenario. These are merged at the CLI boundary, but then internally, the whole config is still passed around and subset by X_modifiers_scenario

(this is definitely a get-this-working change - there needs to be a more thorough overhaul here, which would obviate this problem)

flepimop/gempyor_pkg/src/gempyor/shared_cli.py

TimothyWillard · 2024-10-14T13:51:03Z

flepimop/gempyor_pkg/src/gempyor/deprecated_option.py

Documentation, type hints, and __all__?

I might just prune this out, since I don't use it (@jcblemai didn't want to mark -c deprecated). That said, it would be useful for deprecated options to the CLI - what do folks think?

If this isn't actually used then delete it. If we need it in the future we can dig it up again, but I'd rather not maintain something that isn't used.

flepimop/gempyor_pkg/src/gempyor/simulate.py

TimothyWillard · 2024-10-14T13:56:17Z

examples/test_cli.py

-  result = runner.invoke(simulate, ['-c', 'config_sample_2pop.yml'])
+  result = runner.invoke(simulate, ['config_sample_2pop.yml'])


Are the changes backwards compatible? That wasn't entirely clear to me, and this seems to indicate that they are not. If they are, can this test be parameterized such that it hits both cases to prove backwards compatibility?

they are - note line 33 below. I changed these to demonstrate the new capabilities, and left the last one to demo backwards compatibility.

i've also added a test demonstrating the re-route of gempyor-simulate

It demonstrates the new capabilities, but I don't think it demonstrates backwards compatibility. I think ideally these tests would be parameterized such that they're invoked via the old method and the new method for each fixture.

Fair enough. Let's chat later today about the right testing regime here - I don't think we should need to reach all the way into execution to confirm that the resulting configs / arguments passed are identical. Might warrant a bit of re-architecture / reframing of tests generally to do that, however.

I don't think we should need to reach all the way into execution to confirm that the resulting configs / arguments passed are identical.

Maybe, I think I would be more swayed in that not being needed if there were other unit tests to accompany this PR.

Agree - but I also think it's going to entail re-orienting the click approach a bit. Right now, parse happens in the commands, rather than in the main group, then dispatched down.

I think it's definitely preferable to centralize this even more (i.e. not just in one function, but into one call of that function), but that's maybe a more radical overhaul of the function signatures.
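A minimal sketch of the kind of backwards-compatibility check discussed in this thread. It uses `argparse` from the standard library as a stand-in for `click`, so it is not the PR's implementation; `build_parser` and `resolve_config_files` are hypothetical names, and the second fixture name is a placeholder. Configs may be given via `-c` or positionally, positional files win (with a warning) when both are present, and each fixture is checked through both invocation styles.

```python
import argparse
import warnings


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser accepting configs via -c or as positional arguments."""
    parser = argparse.ArgumentParser(prog="simulate")
    parser.add_argument("-c", "--config", dest="config_filepath", default=None)
    parser.add_argument("config_files", nargs="*")
    return parser


def resolve_config_files(argv: list[str]) -> list[str]:
    """Return the config files to use, preferring positional arguments over -c."""
    args = build_parser().parse_args(argv)
    if args.config_files:
        if args.config_filepath is not None:
            warnings.warn("Positional config files override the -c option.")
        return list(args.config_files)
    if args.config_filepath is not None:
        return [args.config_filepath]
    raise ValueError("No configuration file provided.")


# Each fixture is checked through both the old (-c) and new (positional) style.
# "another_config.yml" is a placeholder name, not a file from the repository.
fixtures = ["config_sample_2pop.yml", "another_config.yml"]
for fixture in fixtures:
    old_style = resolve_config_files(["-c", fixture])
    new_style = resolve_config_files([fixture])
    assert old_style == new_style == [fixture]
```

Comparing the resolved inputs rather than full simulation output keeps the check cheap, which is in the spirit of not reaching all the way into execution.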

TimothyWillard · 2024-10-14T14:13:06Z

It's not clear to me that these changes are backwards compatible, if anything the changes to examples/test_cli.py suggest that they are not, which I think would create a large hurdle for operations. Is there a way to make these changes backwards compatible? Or if they are backwards compatible can the changes to examples/test_cli.py reflect that by providing the config both ways for all of the fixtures?

Actually, looking again more closely I think it is backwards compatible, but I think my confusion at first glance highlights the need to cover both cases a bit more in that particular file as well as add documentation generally.

pearsonca · 2024-10-14T19:47:42Z

Ready for re-review

It doesn't look like there were any unit tests added for any of the new functionality which would be really helpful for maintaining the new features over time,

I added items to test_cli - do you mean for the internal functions, like the option / config merger?

There is a general lack of documentation/type hints which would be helpful in understanding what the new functionality does and allowing others to integrate it into their work,

Added some of this, but could use some more perspective on what else is needed.

Lots of small stylistic changes, but importantly ones that help clean up the diff would be helpful for reviewing,

Did these.

A formatter like black is recommended for new files, and

Applied it.

It's not clear to me that these changes are backwards compatible, if anything the changes to examples/test_cli.py suggest that they are not, which I think would create a large hurdle for operations. Is there a way to make these changes backwards compatible? Or if they are backwards compatible can the changes to examples/test_cli.py reflect that by providing the config both ways for all of the fixtures?

It should be backwards compatible; I'll have a think about how best to include all the ways. Will mean re-assembling the full configs, I guess.

pearsonca · 2024-10-14T19:52:20Z

Re still failing unit test case - not sure what the deal there is. Works locally?

saraloo · 2024-10-14T19:54:02Z

Haven't gone through this in full detail, but re: the tutorials, I would lean towards leaving the current configs as are so they are standalone readable, and including an example where you patch two configs together.

pearsonca · 2024-10-14T20:23:02Z

Haven't gone through this in full detail, but re: the tutorials, I would lean towards leaving the current configs as are so they are standalone readable, and including an example where you patch two configs together.

Reverted back to the full ymls - where do the examples live? Happy to add there.

saraloo · 2024-10-14T20:33:58Z

Haven't gone through this in full detail, but re: the tutorials, I would lean towards leaving the current configs as are so they are standalone readable, and including an example where you patch two configs together.

Reverted back to the full ymls - where do the examples live? Happy to add there.

Sorry, meant in the same folder - could just add your .part versions and in any eventual documentation note that an original config + additional part is equivalent to one of the full configs, or something like that? Pretty straightforward, but I think there's value in keeping the tutorial/example configs that have all the pieces.

pearsonca · 2024-10-14T20:43:07Z

Sorry, meant in the same folder - could just add your .part versions and in any eventual documentation note that an original config + additional part is equivalent to one of the full configs, or something like that? Pretty straightforward, but I think there's value in keeping the tutorial/example configs that have all the pieces.

Alright - so the latest is good then? Happy to modify any existing examples, but it's hard to know which are relevant / where they are.

TimothyWillard

The improvements look good, but I think we're still missing documentation and unit tests which are essential to maintaining this functionality over time and covering edge cases now. Here are some PRs with good test/documentation changes to use as examples:

We're pushing toward the Google style guide for documentation, and an expanded set of examples, provided by Napoleon, can be found here.

And on the unit testing front I think at a minimum unit tests for cli, option_config_files, and parse_config_files need to be implemented. See pytest's getting started guide or the PRs above for examples.

TimothyWillard · 2024-10-15T12:15:35Z

examples/test_cli.py

-  result = runner.invoke(simulate, ['-c', 'config_sample_2pop.yml'])
+  result = runner.invoke(simulate, ['config_sample_2pop.yml'])


It demonstrates the new capabilities, but I don't think it demonstrates backwards compatibility. I think ideally these tests would be parameterized such that they're invoked via the old method and the new method for each fixture.

TimothyWillard · 2024-10-15T12:16:20Z

examples/tutorials/config_sample_2pop_interventions_test.part

Can the .yml extension be added to the file name for syntax highlighting? This looks like valid YAML on its own. And same for the other .part files.

sure, I'll pull the part indicator into the file name.

TimothyWillard · 2024-10-15T12:17:50Z

flepimop/gempyor_pkg/src/gempyor/shared_cli.py

+"""
+An internal module to share common CLI elements.
+"""


Module docstrings must go at the top.
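To illustrate the point, here is a small sketch using the standard `ast` module. The two source strings are made up for the demonstration; only the docstring text is taken from the diff above. A string literal counts as the module docstring only when it is the first statement.

```python
import ast

# Docstring is the first statement: Python records it as the module docstring.
DOCSTRING_FIRST = '''"""An internal module to share common CLI elements."""
import os
'''

# Same string placed after an import: it is just an expression statement.
DOCSTRING_LATE = '''import os
"""An internal module to share common CLI elements."""
'''

first_doc = ast.get_docstring(ast.parse(DOCSTRING_FIRST))
late_doc = ast.get_docstring(ast.parse(DOCSTRING_LATE))

print(first_doc)  # An internal module to share common CLI elements.
print(late_doc)   # None
```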

TimothyWillard · 2024-10-15T12:18:55Z

flepimop/gempyor_pkg/src/gempyor/shared_cli.py

+argument_config_files = click.argument(
+    "config_files", nargs=-1, type=click.Path(exists=True)
+)
+""" `click` Argument decorator to handle configuration file(s) """


Python does not support docstrings for variables, can just make these comments.
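A quick way to see this, as a sketch with a made-up variable name (`TIMEOUT_SECONDS` is not from the PR): a string literal placed after an assignment is evaluated and thrown away, so the variable's `__doc__` is still the one inherited from its type.

```python
SOURCE = '''
TIMEOUT_SECONDS = 30
"""Seconds to wait before giving up."""
'''

# Run the snippet in an isolated namespace so nothing leaks into this module.
namespace = {}
exec(SOURCE, namespace)

# The string after the assignment was not attached to the variable; the
# integer's __doc__ is simply the documentation of the int type.
variable_doc = namespace["TIMEOUT_SECONDS"].__doc__
print(variable_doc == int.__doc__)  # True
```

A `#` comment above the assignment carries the same information without suggesting runtime behavior that does not exist.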

TimothyWillard · 2024-10-15T12:22:51Z

flepimop/gempyor_pkg/src/gempyor/shared_cli.py

+    config_files: list[str],
+    config_filepath: str,


Not a str, is a pathlib.Path. At least that's what the corresponding click argument/option suggests, you could coerce to a string before providing to this function.
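One way to sidestep the mismatch is to normalize at the boundary. The sketch below assumes a hypothetical helper, `normalize_config_paths`, which is not part of gempyor; it accepts either strings or path-like objects and returns `pathlib.Path` objects, which can then be turned into strings where a callee needs them.

```python
import os
import pathlib
from typing import Union


def normalize_config_paths(
    config_files: list[Union[str, os.PathLike]],
) -> list[pathlib.Path]:
    """Hypothetical helper: accept str or path-like entries, return Path objects."""
    return [pathlib.Path(os.fspath(entry)) for entry in config_files]


mixed = ["config1.yml", pathlib.Path("config2.yml")]
normalized = normalize_config_paths(mixed)

# Coerce back to plain strings for a callee that is annotated with str.
as_strings = [str(path) for path in normalized]
```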

TimothyWillard · 2024-10-15T12:23:53Z

flepimop/gempyor_pkg/src/gempyor/shared_cli.py

+    first_sim_index: int,
+    stoch_traj_flag: bool,
+) -> None:
+    """Parse the configuration file(s) and override with command line arguments"""


Documentation?

Hmm, what is the right way to avoid double work here? The click options are, in my mind, the right place to have the documentation, and this def is not intended to be a public interface. I don't want us to get into double maintenance / inconsistency between the allowed options / arguments to this.

along those lines: I propose a separate issue/PR for a more focused refactor to make the options a bit more dynamic: swap to a kwargs-y approach, name the options list defined earlier in the file, and use those names in the loops.

Does that seem reasonable?

I think the way to avoid double work here is to put the documentation with this function. That's fine that they're not intended for the public interface (we should start denoting internal with a leading underscore if we're going to take care to define an external vs internal interface at this stage), but that doesn't exclude it from requiring documentation. We still need to know how to use the function internally when building out future CLI interfaces.

I'm not sure I follow the second comment that well. Why would we move to kwargs for parse_config_files if we're providing them every time (at least from what I see in current usage) anyways, do we anticipate only providing a few of these args?

so if parse_config_files is kwargs-driven, then effectively the arguments can be driven by the names set in the arguments/options objects. Then whatever we do with those options in the future (and maybe that other people do with options for plugins), the parse function "just works". that clearer?

Oh, you mean using **kwargs. Sure, that could be done now or in a future PR doesn't matter too much either way, but it doesn't alleviate the need for documentation. One downside of **kwargs is how to construct a valid call to this function becomes a bit more ambiguous since the valid arguments are not enumerated in the function definition itself. That could be addressed by well maintained documentation though.
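A rough sketch of the trade-off discussed here, assuming a hypothetical function `parse_options` (not the PR's `parse_config_files`). The option names are borrowed from the signature quoted above; the validation and the Google-style docstring show one way to keep a `**kwargs` interface discoverable.

```python
from typing import Any

# Hypothetical set of option names the function looks for; in practice these
# would mirror the names of the CLI options and arguments.
KNOWN_OPTIONS = {"config_files", "first_sim_index", "stoch_traj_flag"}


def parse_options(**kwargs: Any) -> dict[str, Any]:
    """Collect recognized CLI options into a single mapping.

    Args:
        **kwargs: Option values keyed by option name. See Notes for the
            names that are recognized.

    Returns:
        A dict containing only the options that were given a non-`None` value.

    Raises:
        ValueError: If a keyword is not one of the recognized option names.

    Notes:
        Recognized names: `config_files`, `first_sim_index`, `stoch_traj_flag`.
    """
    unknown = sorted(set(kwargs) - KNOWN_OPTIONS)
    if unknown:
        raise ValueError(f"Unrecognized options: {unknown}")
    return {name: value for name, value in kwargs.items() if value is not None}


overrides = parse_options(first_sim_index=1, stoch_traj_flag=None)
print(overrides)  # {'first_sim_index': 1}
```

Rejecting unknown names restores some of the checking that an explicit signature would give, at the cost of keeping `KNOWN_OPTIONS` and the Notes section in sync.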

pearsonca · 2024-10-15T18:38:12Z

Summarizing conversation @TimothyWillard and I had:

add to test_cli using the patch action, and confirm that it matches the non-partitioned file
introduce unit testing for parse_config_files, checking that the purportedly supported options can all be reflected in the config file
otherwise prune the test_cli back to reflect more of its prior checks
exploring changing the approach in shared_cli to use kwargs in parse_config_files (TBD if this makes sense, after seeing what is possible with documentation, argument checking, etc.)

pearsonca · 2024-10-16T13:40:14Z

exploring changing the approach in shared_cli to use kwargs in parse_config_files (TBD if this makes sense, after seeing what is possible with documentation, argument checking, etc.)

alright @TimothyWillard - how does the rough-in strike you?

TimothyWillard

Still missing unit tests for parse_config_files. I'm a bit concerned about:

How the documentation for parse_config_files is formatted as well as the parameters being given. I thought the takeaway was to use **kwargs rather than specify exact args and then list what fields it would look for in a Notes: section or similar. The current documentation is oddly formatted and not super helpful from a developer perspective.
I'm also a bit confused about the --help view of the commands, for example the simulate command's output has expanded with a lot of extra info that isn't helpful, I think coming from the click_helpstring decorator, but on the other hand the flepimop compartments subcommands help pages don't list any of the options/arguments? I think ideally we want a consistent display with the acceptable options/arguments?

TimothyWillard

Left a few minor comments, but overall I'm excited about this PR. I think this sets the stage for a very simple and cohesive CLI interface. The unit testing also makes this maintainable over time.

TimothyWillard · 2024-10-21T22:00:10Z

flepimop/gempyor_pkg/tests/shared_cli/test_cli.py

So does this mean we can delete examples/test_cli.py then? As far as correcting the CI goes it would just require removing the "Run gempyor-cli integration tests from examples" step from .github/workflows/gempyor-ci.yml

I'd argue yes. @jcblemai any reason to keep the original version of this around?

I think we should still have code that runs the actual examples in the example folder, but perhaps we should rename it to test_examples?

saraloo · 2024-10-29T14:10:22Z

documentation/gitbook/how-to-run/multi-configs.md

+
+You may provide an arbitrary number of separate configuration files to combine to create a complete configuration.
+
+At this time, only `simulate` supports multiple configuration files. Also, the patching operation is fairly crude: configuration options override previous ones completely, though with a warning. The files provided from left to right are from lowest priority (i.e. for the first file, only options specified in no other files are used) to highest priority (i.e. for the last file, its options override any other specification).


Some examples would be informative to show the limitations. This could lead to very unexpected behaviour and since we have some external users, i think it would be useful to be explicit here about what this cannot do.

will amend this - how about addressing, say, new seir scenarios?

TimothyWillard · 2024-10-29T14:35:55Z

Noting Joseph's comment regarding shadowing. I think this needs to be tested or some specifics about what happens should be determined first, or at least a strict warning or error here. My instinct reading this was - great I can now add modifiers to a section that is pre-existing, but is this what happens? Would this ever fail, or remove preexisting configurations? From a run-to-run perspective this would be the most useful - the ability to just add parts within chunks, or even to specify how to merge two config sections. This could be moved to its own issue at a later date if we go with just issuing a warning if two config parts have the same sections.

We're thinking about how to be clever-er here, but a lot of that is going to rely on how you want it to work - there are some standards for this kind of thing (JSON patch syntax, e.g.) but they don't feel great as an approach to ask users to write. I think this is a "next issue" item, once people have had time to experiment and recognize the rough edges, try things they think should work / would like to work, etc.

I think in the future we can add flags that dictate this behavior. Say a --merge for merging modifiers together from multiple files or --no-override to avoid overriding sections, for example. I agree that it's best to determine this based on feedback, just providing an example of how to incorporate feedback in the future in a backwards compatible way.
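The limitation raised in this thread can be sketched as follows. This is an illustration, not the PR's code: `patch_configs` is a hypothetical function, it assumes overriding happens per top-level key, and the config contents (the modifier names and values) are made-up placeholders.

```python
import warnings
from typing import Any


def patch_configs(*configs: dict[str, Any]) -> dict[str, Any]:
    """Combine configs left to right; later top-level keys replace earlier ones."""
    combined: dict[str, Any] = {}
    for config in configs:
        for key, value in config.items():
            if key in combined:
                warnings.warn(f"Configuration key '{key}' is overridden by a later file.")
            combined[key] = value
    return combined


# Placeholder contents: only the top-level key name comes from the discussion.
base = {
    "name": "sample_2pop",
    "seir_modifiers": {"modifiers": {"Ro_lockdown": {"value": 0.7}}},
}
extra = {"seir_modifiers": {"modifiers": {"Ro_relax": {"value": 0.8}}}}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    combined = patch_configs(base, extra)

# The whole seir_modifiers section comes from `extra`; Ro_lockdown is gone.
print(sorted(combined["seir_modifiers"]["modifiers"]))  # ['Ro_relax']
print(len(caught))  # 1
```

Under that assumption, a later file that names an existing section replaces it rather than adding to it, which is why an explicit merge option was floated as a possible follow-up.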

pearsonca requested review from jcblemai, TimothyWillard, kjsato, saraloo, twallema and MacdonaldJoshuaCaleb October 9, 2024 18:53

saraloo removed the request for review from kjsato October 11, 2024 18:37

pearsonca force-pushed the multi-config branch from d00b97a to 200a950 Compare October 11, 2024 20:49

This was referenced Oct 12, 2024:

new command flepimop-push, flepimop-pull #296 (Open)
Update test_cli.py #242 (Closed)

TimothyWillard requested changes Oct 14, 2024

pearsonca requested a review from TimothyWillard October 14, 2024 22:21

TimothyWillard requested changes Oct 15, 2024

TimothyWillard self-requested a review October 16, 2024 14:25

TimothyWillard requested changes Oct 16, 2024

pearsonca requested a review from TimothyWillard October 21, 2024 17:34

TimothyWillard previously approved these changes Oct 21, 2024

pearsonca dismissed TimothyWillard’s stale review via d642e26 October 22, 2024 01:56

pearsonca and others added 14 commits October 29, 2024 09:11

7014938 handle tuple case in shared_cli
c9f917e full option testing
0f879fd apply black formatting [skip ci]
bf58e94 WIP test patch [skip ci]
d7beb69 working patch test
558ce4b Update flepimop/gempyor_pkg/src/gempyor/shared_cli.py
Co-authored-by: Timothy Willard
814f8a4 Update flepimop/gempyor_pkg/src/gempyor/shared_cli.py
3e6edb3 prune yaml import from test
6424382 docstring generation tweaking
96cec5b reconstitute simulate interface [skip ci]
51a4860 issue warnings for config files with duplicate keys
0dd4c59 clarify click version requirement and remove string filtering
63f2759 correct click version
daebcdc update documentation to remove reference to gempyor-simulate [skip ci]

pearsonca force-pushed the multi-config branch from 1a8b6c7 to daebcdc Compare October 29, 2024 13:15

cb91676 add multi-config documentation

saraloo reviewed Oct 29, 2024

saraloo reviewed Oct 29, 2024

pearsonca requested review from saraloo, MacdonaldJoshuaCaleb, TimothyWillard and jcblemai October 29, 2024 14:15

00591eb update multi-configs gitbook to give caveat example

TimothyWillard approved these changes Oct 29, 2024

pearsonca changed the base branch from main to dev October 29, 2024 15:07

pearsonca mentioned this pull request Oct 29, 2024:

[Feature request]: multiple config files patching behavior enhancements #371 (Open)

saraloo approved these changes Oct 29, 2024

pearsonca merged commit 1f92081 into dev Oct 30, 2024
2 checks passed

jcblemai deleted the multi-config branch October 30, 2024 15:53
